Visual Speech Recognition Using Cepstral Images

Authors

Eun-Jung Holden

Robyn Owens

Abstract

Automatic lipreading is important in various human-computer interaction applications. Lipreading requires recognition not only of changes in mouth shape but also of the appearance of the inner mouth (the teeth and the tongue). We have developed a lipreading system that represents the changes in mouth shape and inner mouth appearance occurring throughout the input image sequence as a single 70-dimensional feature vector. This is achieved by generating the Cepstral coefficients of the pixel intensity change over time and arranging them as the pixel intensities of a Cepstral image. Higher Order Local Autocorrelation (HLAC) features are then extracted from the Cepstral images and used for classification. This paper explains the techniques used in the system and reports on our feasibility study of their use in automated lipreading.
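The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a real cepstrum computed per pixel along the time axis with NumPy, and it computes only the order-0 and order-1 HLAC terms over a 3x3 neighbourhood (5 values per image), not the full mask set needed for the 70-dimensional vector the abstract reports. The function names `cepstral_images` and `hlac_low_order`, the parameter `n_coeffs`, and the random input are all hypothetical.

```python
import numpy as np


def cepstral_images(frames, n_coeffs=2, eps=1e-8):
    """Map a (T, H, W) grayscale sequence to (n_coeffs, H, W) cepstral images.

    Assumption: real cepstrum of each pixel's intensity trajectory over time.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Spectrum of every pixel's intensity signal along the time axis.
    spectrum = np.fft.fft(frames, axis=0)
    # Log-magnitude spectrum; eps avoids log(0) for constant pixels.
    log_mag = np.log(np.abs(spectrum) + eps)
    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    cepstrum = np.fft.ifft(log_mag, axis=0).real
    # Each retained coefficient index becomes one image.
    return cepstrum[:n_coeffs]


def hlac_low_order(image):
    """Order-0 and order-1 autocorrelation features in a 3x3 window (5 values)."""
    img = np.asarray(image, dtype=np.float64)
    centre = img[1:-1, 1:-1]  # reference pixels, excluding the border
    h, w = centre.shape
    feats = [centre.sum()]  # order 0: sum of f(r)
    # Order 1: sum of f(r) * f(r + a) for four displacements a;
    # the opposite displacements are equivalent under translation.
    for dy, dx in [(0, 1), (1, -1), (1, 0), (1, 1)]:
        shifted = img[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        feats.append((centre * shifted).sum())
    return np.array(feats)


# Usage with synthetic data standing in for a mouth-region image sequence.
rng = np.random.default_rng(0)
frames = rng.random((16, 8, 8))  # 16 frames of 8x8 pixels
images = cepstral_images(frames, n_coeffs=2)
features = np.concatenate([hlac_low_order(im) for im in images])
```

The point of this design is that the whole sequence collapses into a fixed-length vector regardless of its spatial content, which is what allows a standard classifier to be applied afterwards.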

Similar resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

A link between cepstral shrinking and the weighted product rule in audio-visual speech recognition

The weighted product rule has been shown empirically to be of great benefit in audio-visual speech recognition (AVSR), for isolated word recognition tasks. A firm theoretical basis for the selection of effective weights is of considerable interest to the audio-visual speech processing community. In this paper a clear link is established between the selection of effective weightings and the appr...

A Link between Cepstral Shrink Product Rule in Audio-visual

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...

Publication date: 2000
